Automatic Speech Recognition Errors as a Predictor of L2 Listening Difficulties
Abstract
This paper investigates the use of automatic speech recognition (ASR) errors as indicators of second language (L2) learners’ listening difficulties and, in doing so, strives to overcome the shortcomings of the Partial and Synchronized Caption (PSC) system. PSC is a system that generates a partial caption containing difficult words, detected on the basis of high speech rate, low frequency, and specificity. To improve the choice of words in this system and to explore a better method of detecting speech challenges, ASR errors were investigated as a model of the L2 listener, on the hypothesis that some of these errors are similar to those language learners make when transcribing the videos. To investigate this hypothesis, ASR errors in the transcription of several TED talks were analyzed and compared with the words selected by PSC. Both the overlapping and the mismatching cases were analyzed to identify possible improvements to the PSC system. The ASR errors that PSC did not detect as cases of learner difficulty were further analyzed and classified into four categories: homophones, minimal pairs, breached boundaries, and negatives. These errors were embedded into the baseline PSC to produce an enhanced version, which was evaluated in an experiment with L2 learners. The results indicated that the enhanced version, which encompasses the ASR errors, addresses most of the L2 learners’ difficulties and assists them in comprehending challenging video segments better than the baseline does.
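The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a plain word-level alignment from Python's standard `difflib` in place of whatever ASR scoring tool was actually used, and the function names `asr_error_words` and `split_by_psc`, the example sentences, and the PSC word list are all hypothetical.

```python
from difflib import SequenceMatcher


def asr_error_words(reference, hypothesis):
    """Return the reference words that the ASR output substituted or deleted.

    Assumption: a simple word-level diff stands in for a proper ASR
    alignment; insertions in the hypothesis are ignored here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    errors = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            errors.extend(ref[i1:i2])
    return errors


def split_by_psc(errors, psc_words):
    """Split ASR error words into those PSC already selected and those it missed."""
    psc = {w.lower() for w in psc_words}
    overlapping = [w for w in errors if w in psc]
    missed = [w for w in errors if w not in psc]
    return overlapping, missed


# Hypothetical data: a human transcript, an ASR transcript, and PSC's choices.
reference = "we could not hear the whole phenomenon"
hypothesis = "we could hear the hole phenomenon"
psc_selected = ["phenomenon"]

errors = asr_error_words(reference, hypothesis)
overlapping, missed = split_by_psc(errors, psc_selected)
print("ASR errors:", errors)
print("Already in PSC:", overlapping)
print("Missed by PSC:", missed)
```

In this toy case the words missed by PSC are a dropped negative ("not") and a homophone ("whole" heard as "hole"), two of the four categories named in the abstract. Assigning errors to categories automatically would need pronunciation data and is left out of the sketch.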
Similar resources
ASR technology to empower partial and synchronized caption for L2 listening development
This study introduces a tool, partial and synchronized caption (PSC), for training second language (L2) listening skill. PSC uses an automatic speech recognition (ASR) system to realize word-level alignment between text and speech while it refers to the corpora to effectively select a subset of words for inclusion in the caption. The selection criteria are based on three features contributing t...
Syllable structure affects second-language spoken word recognition and production
In this study, we show that second-language (L2) spoken-word recognition is greatly influenced by syllable-structure differences between the native language (L1) and the second language (L2), and that L2 word-recognition accuracy is a reliable predictor of L2 word-production accuracy. Spanish-speaking English learners (experimental group) completed a listening task in which they monitored /(ǝ)s...
Partial and synchronized captioning: A new tool for second language listening development
This study investigates a novel method of captioning, partial and synchronized, as a listening tool for second language (L2) learners. In this method, the term partial and synchronized caption (PSC) pertains to the presence of a selected set of words in a caption where words are synced to their corresponding speech signal, using a state-of-the-art automatic speech recognition (ASR) technology. ...
Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have been considered by researchers. Among these, lip-reading techniques have encountered many challenges for speech recognition, and one of the challenges b...
Single-Ended Prediction of Listening Effort Based on Automatic Speech Recognition
A new single-ended, i.e. reference-free, measure for the prediction of perceived listening effort of noisy speech is presented. It is based on phoneme posterior probabilities (or posteriorgrams) obtained from a deep neural network of an automatic speech recognition system. Additive noise or other distortions of speech tend to smear the posteriorgrams. The smearing is quantified by a performance...